For the following series, find an appropriate Box-Cox transformation in order to stabilize the variance.
usnetelec, usgdp, mcopper, enplanements
## [1] 0.1919
The first two series don’t really need a transformation.
Why is a Box-Cox transformation unhelpful for the cangas data?
## [1] 0.57678
Here the variance of the series changes, but not with the level of the series. Box-Cox transformations are designed to handle series where the variance increases (or decreases) with the level of the series.
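As a reminder of what the transformation does, here is the classic Box-Cox formula written out as a plain R function (a sketch only; forecast::BoxCox is the real implementation):

```r
# Classic Box-Cox transformation: log for lambda = 0, a power
# transformation otherwise. (forecast::BoxCox is the real implementation;
# this sketch just spells out the formula.)
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
```

Because the transformation is a monotonic function of the level, it can only tame variation that grows or shrinks with the level, which is why it cannot help with the cangas series.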
What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?
retaildata <- readxl::read_excel("data/retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
autoplot(myts)
From visual inspection, a log transformation would be appropriate here. It also makes sense, as retail expenditure is likely to increase proportionally to population, and therefore the seasonal fluctuations are likely to be proportional to the level of the series. It has the added advantage of being easier to explain than some other transformations. Finally, it is relatively close to the automatically selected value of BoxCox.lambda(myts) \(= 0.128\).
If you have selected a different series from the retail data set, you might choose a different transformation.
For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
dole, usdeaths, bricksq.
lambda <- BoxCox.lambda(dole)
dole %>% BoxCox(lambda) %>%
autoplot() +
ylab(paste("BoxCox(# people,", round(lambda, 2), ")"))
The data were transformed using a Box-Cox transformation with parameter \(\lambda = 0.33\). The transformation has stabilized the variance.
There is no need for a transformation for these data.
lambda <- BoxCox.lambda(bricksq)
bricksq %>% BoxCox(lambda) %>%
autoplot() +
ylab(paste("BoxCox(# mln bricks,", round(lambda, 2), ")"))
The time series was transformed using a Box-Cox transformation with \(\lambda = 0.25\). The transformation has stabilized the variance.
Calculate the residuals from a seasonal naïve forecast applied to the quarterly Australian beer production data from 1992. The following code will help.
Test if the residuals are white noise and normally distributed.
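The code referred to above is not reproduced in this copy; a sketch of it, using the fpp2 beer series, would be:

```r
library(fpp2)

beer <- window(ausbeer, start = 1992)  # quarterly beer production from 1992
fc <- snaive(beer)                     # seasonal naive forecasts
res <- residuals(fc)
autoplot(res)

# Ljung-Box test plus residual ACF and histogram in one call
checkresiduals(fc)
```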
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 32.3, df = 8, p-value = 8.3e-05
##
## Model df: 0. Total lags used: 8
What do you conclude?
The residuals are correlated: the null hypothesis of no joint autocorrelation is clearly rejected. We can also see a significant spike at the seasonal lag in the ACF. There is considerable information remaining in the residuals which has not been captured by the seasonal naïve method. The residuals do not appear to be too far from normally distributed.
Repeat the exercise for the
WWWusage and bricksq data. Use whichever of naive or snaive is more appropriate in each case.
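A sketch of the corresponding code: WWWusage is not seasonal, so naive() is appropriate; bricksq is quarterly with strong seasonality, so snaive() is the natural choice.

```r
library(fpp2)

checkresiduals(naive(WWWusage))   # non-seasonal series: naive method
checkresiduals(snaive(bricksq))   # quarterly seasonal series: seasonal naive
```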
##
## Ljung-Box test
##
## data: Residuals from Naive method
## Q* = 146, df = 10, p-value <2e-16
##
## Model df: 0. Total lags used: 10
Residuals are correlated as shown by both the LB test and the ACF. They seem to be normally distributed. There is considerable information remaining in the residuals which has not been captured with the naïve method.
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 233, df = 8, p-value <2e-16
##
## Model df: 0. Total lags used: 8
Residuals are correlated as shown by both the LB test and the ACF and do not appear to be normal (they have a long left tail). There is considerable information remaining in the residuals which has not been captured with the seasonal naïve method.
Are the following statements true or false? Explain your answer.
- Good forecast methods should have normally distributed residuals.
Not true. It is helpful to have normally distributed residuals because it makes the calculation of prediction intervals easier, and it means that least squares estimates of parameters are also equivalent (or close to) maximum likelihood estimates. But it doesn’t make the forecasts better. If the residuals are not normally-distributed, one way to produce prediction intervals is to use a bootstrapped approach.
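For example, the forecast package can bootstrap the residuals when computing prediction intervals, rather than assuming normality (a sketch):

```r
library(fpp2)

# Prediction intervals computed from bootstrapped residuals instead of a
# normality assumption
fc <- naive(WWWusage, bootstrap = TRUE)
```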
- A model with small residuals will give good forecasts.
Not true. An over-fitted model will have small residuals (relative to other models), but will probably not give good forecasts.
- The best measure of forecast accuracy is MAPE.
Not true. MAPE is useful in some circumstances — for example, in comparing forecasts on different scales, or with different units, and is relatively easy to understand. But it is not appropriate if the data includes zeros, or if the data has no natural zero.
- If your model doesn’t forecast well, you should make it more complicated.
Not true. Some things are hard to forecast, and making a model more complicated can make the forecasts worse.
- Always choose the model with the best forecast accuracy as measured on the test set.
Not true. Imagine the test set has only a single observation, or a very small number of observations. We don’t want to select a model based on a small test set. A better approach is to use time-series cross-validation, which is based on a much larger set of test sets. Later, we will learn about the AIC statistic which is an alternative way to select a model and is often more helpful than a simple test set.
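Time-series cross-validation is available via tsCV() in the forecast package; a sketch, using the naive method for illustration:

```r
library(fpp2)

# One-step forecast errors from a rolling forecast origin
e <- tsCV(WWWusage, naive, h = 1)
sqrt(mean(e^2, na.rm = TRUE))  # cross-validated RMSE
```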
For your retail time series (from Exercise 3 in Section 2.10):
- Split the data into two parts using
retaildata <- readxl::read_excel("data/retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)
- Check that your data have been split appropriately by producing the following plot.
- Calculate forecasts using
snaive() applied to myts.train.
- Compare the accuracy of your forecasts against the actual values stored in
myts.test.
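The accuracy table below can be produced with code along these lines (a sketch):

```r
# Seasonal naive forecasts over the test period, compared with the
# held-out observations
fc <- snaive(myts.train, h = length(myts.test))
accuracy(fc, myts.test)
```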
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set 7.773 20.246 15.957 4.7028 8.1098 1.0000 0.73851 NA
## Test set 55.300 71.443 55.783 14.9010 15.0820 3.4959 0.53152 1.2979
The number to look at here is the test set RMSE of 71.443. That provides a benchmark for comparison when we try other models.
- Check the residuals. Do the residuals appear to be uncorrelated and normally distributed?
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 624, df = 24, p-value <2e-16
##
## Model df: 0. Total lags used: 24
The residuals do not look like white noise: there are lots of dynamics left over that need to be explored. They do look close to normal, although the tails may be too long.
- How sensitive are the accuracy measures to the training/test split?
The accuracy measures are always sensitive to this split. There are better ways to check the robustness of a method's accuracy, such as using a rolling forecast origin (possible in this case, as we have lots of data) via the tsCV() function.
visnights contains quarterly visitor nights (in millions) from 1998-2015 for eight regions of Australia.
- Use
window() to create three training sets for visnights[,"QLDMetro"], omitting the last 1, 2 and 3 years; call these train1, train2, and train3, respectively.
train1 <- window(visnights[, "QLDMetro"], end = c(2015, 3))
train2 <- window(visnights[, "QLDMetro"], end = c(2015, 2))
train3 <- window(visnights[, "QLDMetro"], end = c(2015, 1))
- Compute one year of forecasts for each training set using the
snaive() method. Call these fc1, fc2 and fc3, respectively.
- Use
accuracy() to compare the MAPE over the three test sets. Comment on these.
First we will copy the actual data into a variable. Then we can do an accuracy comparison.
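A sketch of that comparison, computing one year of quarterly forecasts from each training set:

```r
fc1 <- snaive(train1, h = 4)    # one year = 4 quarters of forecasts
fc2 <- snaive(train2, h = 4)
fc3 <- snaive(train3, h = 4)

qld <- visnights[, "QLDMetro"]  # actual data to compare against
accuracy(fc1, qld)
accuracy(fc2, qld)
accuracy(fc3, qld)
```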
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.020575 1.05406 0.85999 -0.22503 8.0937 1.00000 0.060460
## Test set 0.450123 0.83107 0.56752 3.64361 4.9319 0.65991 0.080744
## Theil's U
## Training set NA
## Test set 0.44665
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.010551 1.05869 0.86269 -0.32467 8.1201 1.00000 0.064916
## Test set 0.602491 0.91832 0.70032 5.10674 6.1803 0.81179 0.188457
## Theil's U
## Training set NA
## Test set 0.54174
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.017609 1.06535 0.86906 -0.25581 8.1712 1.00000 0.065732
## Test set 0.388349 0.90332 0.72835 2.98453 6.6901 0.83809 -0.011077
## Theil's U
## Training set NA
## Test set 0.55221
This should give results similar to the following consolidated table.
ME RMSE MAE MPE MAPE MASE ACF1
Training set 0.1637836 1.742687 1.360271 0.4384347 7.357322 1.0000000 0.06643175
Test set fc1 -1.3010774 1.301077 1.301077 -6.9956861 6.995686 0.9564839 NA
Test set fc2 0.08383478 1.387447 1.384912 -0.4063445 6.589342 1.019346 -0.50000000
Test set fc3 0.06202858 1.132896 0.9294135 -0.237857 4.425934 0.6738562 -0.51548610
The lower MAPE value for fc3 indicates that the snaive() forecasts were most accurate on the third test set.
Use the Dow Jones index (data set
dowjones) to do the following:
- Produce a time plot of the series.
- Produce forecasts using the drift method and plot them.
Let’s assume we want to forecast the next 5, 10 and 15 values.
dowfc1 <- rwf(dowjones, drift=TRUE, h=5)
dowfc2 <- rwf(dowjones, drift=TRUE, h=10)
dowfc3 <- rwf(dowjones, drift=TRUE, h=15)
Then we can plot these values.
autoplot(dowjones) +
autolayer(dowfc1, PI=FALSE, series="Drift 5") +
autolayer(dowfc2, PI=FALSE, series="Drift 10") +
autolayer(dowfc3, PI=FALSE, series="Drift 15") +
xlab("Time") + ylab("Closing Price (US$)") +
ggtitle("Dow Jones index") +
guides(colour=guide_legend(title="Forecast"))
- Show that the forecasts are identical to extending the line drawn between the first and last observations.
We can plot the forecasts in a different order, so the shorter forecasts are superimposed, showing the lines are the same.
autoplot(dowjones) +
autolayer(dowfc3, PI=FALSE, series="Drift 15") +
autolayer(dowfc2, PI=FALSE, series="Drift 10") +
autolayer(dowfc1, PI=FALSE, series="Drift 5") +
xlab("Time") + ylab("Closing Price (US$)") +
ggtitle("Dow Jones index") +
guides(colour=guide_legend(title="Forecast"))
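We can also verify this numerically: the drift forecast at horizon \(h\) is \(y_T + h (y_T - y_1)/(T-1)\), i.e. the line through the first and last observations extended forward. A sketch of the check:

```r
# Drift forecasts should equal the line through the first and last
# observations, extended forward
n <- length(dowjones)
slope <- (dowjones[n] - dowjones[1]) / (n - 1)
manual <- dowjones[n] + (1:5) * slope
all.equal(as.numeric(dowfc1$mean), as.numeric(manual))  # should be TRUE
```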
- Try using some of the other benchmark functions to forecast the same data set. Which do you think is best? Why?
The time series isn’t seasonal, so the seasonal naïve method is not viable. However, we can use the mean and naïve methods.
dowfc1 <- meanf(dowjones, h=5)
dowfc2 <- rwf(dowjones, h=5)
dowfc3 <- rwf(dowjones, drift=TRUE, h=5)
autoplot(dowjones) +
autolayer(dowfc1, PI=FALSE, series="Mean") +
autolayer(dowfc2, PI=FALSE, series="Naïve") +
autolayer(dowfc3, PI=FALSE, series="Drift") +
xlab("Time") + ylab("Closing Price (US$)") +
ggtitle("Dow Jones index") +
guides(colour=guide_legend(title="Forecast"))
The three forecasts are very different here. The mean method averages over the entire series, so its forecasts are unlikely to follow the current trend.
Consider the daily closing IBM stock prices (data set
ibmclose).
- Produce some plots of the data in order to become familiar with it.
- Split the data into a training set of 300 observations and a test set of 69 observations.
ibm.train <- window(ibmclose, end=300)
ibm.test <- window(ibmclose, start=301)
autoplot(ibmclose) +
autolayer(ibm.train, series="Training") +
autolayer(ibm.test, series="Test")
- Try using various benchmark methods to forecast the training set and compare the results on the test set. Which method did best?
h <- length(ibm.test)
m.f <- meanf(ibm.train, h=h)
rw.f <- rwf(ibm.train, h=h)
rwd.f <- rwf(ibm.train, drift=TRUE, h=h)
autoplot(ibmclose) +
xlab("Day") +
ggtitle("Daily closing IBM stock prices") +
autolayer(m.f$mean, col=2, series="Mean method") +
autolayer(rw.f$mean, col=3, series="Naïve method") +
autolayer(rwd.f$mean, col=4, series="Drift method")
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 1.6604e-14 73.615 58.722 -2.6421 13.030 11.521 0.98958
## Test set -1.3062e+02 132.126 130.618 -35.4788 35.479 25.626 0.93147
## Theil's U
## Training set NA
## Test set 19.055
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set -0.28094 7.3028 5.097 -0.082629 1.1158 1.000 0.13511 NA
## Test set -3.72464 20.2481 17.029 -1.293917 4.6682 3.341 0.93147 2.9735
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 2.8705e-14 7.2974 5.128 -0.025301 1.1217 1.0061 0.13511
## Test set 6.1081e+00 17.0670 13.975 1.419201 3.7079 2.7418 0.90459
## Theil's U
## Training set NA
## Test set 2.3611
In terms of accuracy measures on the test set, the drift method does better.
- Check the residuals of your preferred method. Do they resemble white noise?
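The diagnostics below come from calls such as the following (a sketch):

```r
checkresiduals(rwd.f)  # Ljung-Box test, residual ACF and histogram

# Equivalent standalone test; fitdf = 1 accounts for the drift parameter
Box.test(residuals(rwd.f), lag = 10, type = "Ljung-Box", fitdf = 1)
```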
##
## Ljung-Box test
##
## data: Residuals from Random walk with drift
## Q* = 22.6, df = 9, p-value = 0.0073
##
## Model df: 1. Total lags used: 10
##
## Box-Ljung test
##
## data: residuals(rwd.f)
## X-squared = 22.6, df = 9, p-value = 0.0073
The Ljung-Box test rejects the white-noise hypothesis (p = 0.0073), so some autocorrelation remains in the residuals. The distribution is also not normal: the tails are too long.
Consider the sales of new one-family houses in the USA, Jan 1973 – Nov 1995 (data set
hsales).
- Produce some plots of the data in order to become familiar with it.
- Split the
hsales data set into a training set and a test set, where the test set is the last two years of data.
hsales.train <- window(hsales, end=c(1993,11))
hsales.test <- window(hsales, start=c(1993,12))
autoplot(hsales) +
autolayer(hsales.train, series="Training") +
autolayer(hsales.test, series="Test")
- Try using various benchmark methods to forecast the training set and compare the results on the test set. Which method did best?
h <- length(hsales.test)
m.f <- meanf(hsales.train, h=h)
rw.f <- rwf(hsales.train, h=h)
sn.f <- snaive(hsales.train, h=h)
rwd.f <- rwf(hsales.train, drift=TRUE, h=h)
autoplot(hsales) +
xlab("Year") +
ggtitle("Sales") +
autolayer(m.f$mean, col=2, series="Mean method") +
autolayer(rw.f$mean, col=3, series="Naïve method") +
autolayer(sn.f$mean, col=4, series="Seasonal naïve method") +
autolayer(rwd.f$mean, col=5, series="Drift method")
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set 3.5105e-15 12.1628 9.5327 -6.1449 20.383 1.12343 0.8662 NA
## Test set 3.8395e+00 9.0226 7.5616 4.7791 13.262 0.89113 0.5378 1.1317
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set -0.0080 6.3011 5.0000 -0.76746 9.904 0.58925 0.18245 NA
## Test set 2.7917 8.6289 7.2083 2.85864 12.849 0.84950 0.53780 1.0984
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set 0.10042 10.5822 8.4854 -2.18427 17.6337 1.0000 0.83698 NA
## Test set 1.04167 5.9055 4.7917 0.97202 8.5457 0.5647 0.16878 0.70915
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set 1.5064e-15 6.3011 4.9999 -0.7511 9.9031 0.58924 0.18245 NA
## Test set 2.8917e+00 8.6588 7.2490 3.0426 12.9017 0.85430 0.53787 1.1003
In terms of accuracy measures on the test set, the seasonal naïve method does better.
- Check the residuals of your preferred method. Do they resemble white noise?
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 682, df = 24, p-value <2e-16
##
## Model df: 0. Total lags used: 24
Residuals are correlated. They show some cyclic behaviour. They seem normal.